Learning Graph Structure With Parametric and Non-Parametric Models
Abstract
In discrete undirected graphical models, the conditional independence of the node labels Y is specified by the graph structure. We study the case where there is another input random vector X (e.g., observed features) such that the distribution P(Y | X) is determined by functions of X that characterize the (higher-order) interactions among the Y's. The main contribution is to learn the graph structure and the functions conditioned on X at the same time. Parameterizing the graphical models with potential functions might lead to overparameterization. We prove that discrete undirected graphical models with feature X are equivalent to multivariate discrete models. Reparameterizing the potential functions of the graphical models by the conditional log odds ratios of the multivariate discrete models offers advantages in representing the conditional independence structure, and the two parameterizations are proved to be equivalent. In addition, the spaces of conditional log odds ratios can be chosen flexibly: they can be linear functional spaces (parametric) or separable Reproducing Kernel Hilbert Spaces determined by kernels (non-parametric). To obtain a sparse estimate of the graph structure, we impose a Structure Lasso (SLasso) penalty on groups of the conditional log odds ratios. These overlapping groups are designed to enforce hierarchical function selection. An efficient gradient descent algorithm is given to estimate the complete model, and its global convergence is guaranteed. A greedy approach is applied when the graph is large. The BGACV tuning method is derived to select the tuning parameter, and it achieves satisfactory numerical results in simulation studies. The asymptotic analysis shows that the SLasso method is consistent in estimating the graph structure. The consistency properties hold for both the parametric and the non-parametric models.
The experiments show that the SLasso method is able to recover the graph structure with increasing sample size. It also outperforms other methods in the simulation studies.
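As a rough illustration of two ideas mentioned in the abstract, the sketch below computes the log odds ratio of two binary labels from a 2x2 table of joint probabilities, and applies a group soft-thresholding step of the kind commonly used with group lasso penalties. This is not the paper's algorithm: the overlapping-group SLasso penalty and its gradient descent solver are not specified in the abstract, and the function names `pairwise_log_odds_ratio` and `group_soft_threshold` are hypothetical, introduced only for this example.

```python
import math


def pairwise_log_odds_ratio(p):
    """Log odds ratio of two binary labels at a fixed input x.

    p[a][b] is assumed to hold P(Y1 = a, Y2 = b | x) for a, b in {0, 1},
    with every entry strictly positive. The result is zero exactly when
    Y1 and Y2 are independent given x.
    """
    return math.log((p[1][1] * p[0][0]) / (p[1][0] * p[0][1]))


def group_soft_threshold(v, lam):
    """Standard (non-overlapping) group soft-thresholding.

    Shrinks the whole coefficient group v toward zero by lam in Euclidean
    norm, and sets the group exactly to zero when its norm is at most lam.
    Assumption: this stands in for the paper's overlapping-group penalty,
    which is more involved.
    """
    norm = math.sqrt(sum(c * c for c in v))
    if norm <= lam:
        return [0.0] * len(v)
    scale = 1.0 - lam / norm
    return [scale * c for c in v]


# Independent labels: the joint table is a product of marginals.
p_indep = [[0.6 * 0.3, 0.6 * 0.7],
           [0.4 * 0.3, 0.4 * 0.7]]
# Dependent labels: mass concentrated on agreeing values.
p_dep = [[0.4, 0.1],
         [0.1, 0.4]]

print(pairwise_log_odds_ratio(p_indep))  # numerically close to zero
print(pairwise_log_odds_ratio(p_dep))    # positive: labels tend to agree
print(group_soft_threshold([0.3, 0.4], 1.0))  # small group is zeroed out
print(group_soft_threshold([3.0, 4.0], 1.0))  # large group is only shrunk
```

In this simplified picture, a group of coefficients that is thresholded to zero corresponds to a log odds ratio that vanishes, which is how a group penalty can remove an edge from the estimated graph.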
Similar resources
Comparing Structure Learning Methods for RKHS Embeddings of Protein Structures
Non-parametric graphical models, embedded in reproducing kernel Hilbert spaces, provide a framework to model multi-modal and arbitrary multi-variate distributions, which are essential when modeling complex protein structures. Non-parametric belief propagation requires the structure of the graphical model to be known a priori. Currently there are nonparametric structure learning algorithms avail...
Predictive Ability of Statistical Genomic Prediction Methods When Underlying Genetic Architecture of Trait Is Purely Additive
A simulation study was conducted to address how a purely additive (simple) genetic architecture might affect the efficacy of parametric and non-parametric genomic prediction methods. For this purpose, we simulated a trait with narrow-sense heritability h^2 = 0.3, with only additive genetic effects for 300 loci, in order to compare the predictive ability of 14 more practically used ge...
Non-Parametric Bayesian Sum-Product Networks
We define two non-parametric models for Sum-Product Networks (SPNs) (Poon & Domingos, 2011). The first is a tree structure of Dirichlet Processes; the second is a dag of hierarchical Dirichlet Processes. These generative models for data implicitly define a prior distribution on SPN of tree and of dag structure. They allow MCMC fitting of data to SPN models, and the learning of SPN structure fro...
Regression Modeling for Spherical Data via Non-parametric and Least Square Methods
Introduction: Statistical analysis of data on the Earth's surface has been a favorite subject among many researchers. Such data can relate to the migration of animals from one region to another. Statistical modeling of their paths then helps biological researchers predict their movements and estimate the areas most likely to contain the animals. From a geome...
Stock price analysis using machine learning method (Non-sensory-parametric backup regression algorithm in linear and nonlinear mode)
The most common starting point for investors when buying a stock is to look at the trend of price changes. In recent years, researchers have used different models to predict stock prices, and artificial intelligence techniques, including neural networks, genetic algorithms and fuzzy logic, have achieved successful results in solving complex problems; in this regard, more exploita...
Publication date: 2000